{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nQHAWIkmK08L"
   },
   "source": [
    "# **Hybrid Search**\n",
    "**BM25** is a sophisticated ranking function used in information retrieval. Acting like a highly efficient librarian, it excels in navigating through extensive collections of documents. Its effectiveness lies in term Frequency: Evaluating how often search terms appear in each document. Document Length Normalization: Ensuring a fair chance for both short and long documents in search results. Bias-Free Information Retrieval: Ideal for large data sets where unbiased results are critical. About LanceDB (VectorDB) LanceDB extends our search capabilities beyond mere keyword matching. It brings in a layer of contextual understanding, interpreting the semantics of search queries to provide results that align with the intended meaning\n",
    "\n",
    "**Hybrid Search Approach** - Our hybrid search system synergizes BM25's keyword-focused precision with LanceDB's semantic understanding. This duo delivers nuanced, comprehensive search results, perfect for complex and varied datasets."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "NhXas2ftK08N"
   },
   "source": [
    "## Installing all the dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "RRYSu48huSUW",
    "outputId": "5cc813a0-671a-4dbe-e7ec-ab7d75448ece"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.4/2.4 MB\u001b[0m \u001b[31m27.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.0/1.0 MB\u001b[0m \u001b[31m43.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m409.5/409.5 kB\u001b[0m \u001b[31m19.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.1/3.1 MB\u001b[0m \u001b[31m70.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.5/49.5 kB\u001b[0m \u001b[31m3.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[?25h"
     ]
    }
   ],
   "source": [
    "!pip -q install langchain huggingface_hub langchain_community langchain_openai lancedb openai tiktoken rank_bm25 pypdf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "dNA4TsHpu6OM"
   },
   "outputs": [],
   "source": [
    "# pass openai api key\n",
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-proj-....\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IC9MvzU04XLL"
   },
   "source": [
    "### OpenSource Models\n",
    "https://github.com/lancedb/vectordb-recipes/blob/main/tutorials/chatbot_using_Llama2_&_lanceDB\n",
    "\n",
    "You can also compare your results with normal retriever vs ensemble retriever"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gJq7RFOw3ULM"
   },
   "source": [
    "## Hybrid Search"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "HqwsGJDhvAQ5"
   },
   "source": [
    "**BM25 Retriever** - Sparse retriever\n",
    "\n",
    "**Embeddings** - Dense retrievers Lancedb\n",
    "\n",
    "`Hybrid search = Sparse + Dense retriever`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "smNHWeu_K08P"
   },
   "source": [
    "## Load the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "s4hg4oaS0ssW",
    "outputId": "9e286890-924f-4d19-c2c1-518854b5e2bf"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2024-11-24 07:35:55--  https://pdf.usaid.gov/pdf_docs/PA00TBCT.pdf\n",
      "Resolving pdf.usaid.gov (pdf.usaid.gov)... 96.17.46.187, 2600:1408:7:1b8::1923, 2600:1408:7:1b4::1923\n",
      "Connecting to pdf.usaid.gov (pdf.usaid.gov)|96.17.46.187|:443... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 6419525 (6.1M) [application/pdf]\n",
      "Saving to: ‘PA00TBCT.pdf’\n",
      "\n",
      "PA00TBCT.pdf        100%[===================>]   6.12M  --.-KB/s    in 0.1s    \n",
      "\n",
      "2024-11-24 07:35:55 (52.6 MB/s) - ‘PA00TBCT.pdf’ saved [6419525/6419525]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# download the pdf\n",
    "!wget https://pdf.usaid.gov/pdf_docs/PA00TBCT.pdf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "id": "xbWHSFWXK08P"
   },
   "outputs": [],
   "source": [
    "# load single pdf\n",
    "from langchain_community.document_loaders import PyPDFLoader\n",
    "\n",
    "loader = PyPDFLoader(\"/content/PA00TBCT.pdf\")\n",
    "pages = loader.load_and_split()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "bQDbbOWRK08Q"
   },
   "source": [
    "## Importing all the libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "id": "7gWOeb1KK08Q"
   },
   "outputs": [],
   "source": [
    "from langchain_community.vectorstores import LanceDB\n",
    "import lancedb\n",
    "from langchain.retrievers import BM25Retriever, EnsembleRetriever\n",
    "from langchain.schema import Document\n",
    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
    "from langchain.document_loaders import PyPDFLoader\n",
    "from langchain_openai import ChatOpenAI\n",
    "from langchain.chains import RetrievalQA"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "sRpdAIugK08Q"
   },
   "source": [
    "## Initialize Embeddings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "vTFiRfioK08Q",
    "outputId": "d92838de-b50b-4db7-b1f9-45ade0f5a88f"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "<ipython-input-8-4cb8ecc446d1>:2: LangChainDeprecationWarning: The class `OpenAIEmbeddings` was deprecated in LangChain 0.0.9 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-openai package and should be used instead. To use it run `pip install -U :class:`~langchain-openai` and import as `from :class:`~langchain_openai import OpenAIEmbeddings``.\n",
      "  embedding = OpenAIEmbeddings()\n"
     ]
    }
   ],
   "source": [
    "# Initialize embeddings\n",
    "embedding = OpenAIEmbeddings()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7JBAAIUfK08Q"
   },
   "source": [
    "## Initialize the BM25"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "FSnrZoVjK08Q",
    "outputId": "4ff63eef-0f97-4eed-cb14-fb59333b1db9"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "type of bm25 <class 'langchain_community.retrievers.bm25.BM25Retriever'>\n"
     ]
    }
   ],
   "source": [
    "# Initialize the BM25 retriever\n",
    "bm25_retriever = BM25Retriever.from_documents(pages)\n",
    "bm25_retriever.k = 2  # Retrieve top 2 results\n",
    "\n",
    "print(\"type of bm25\", type(bm25_retriever))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "KQ7gKiwxK08R"
   },
   "source": [
    "## Initialize the database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "sbW5Mmv-K08R"
   },
   "outputs": [],
   "source": [
    "db = lancedb.connect(\"/tmp/lancedb\")\n",
    "table = db.create_table(\n",
    "    \"pandas_docs\",\n",
    "    data=[\n",
    "        {\n",
    "            \"vector\": embedding.embed_query(\"Hello World\"),\n",
    "            \"text\": \"Hello World\",\n",
    "            \"id\": \"1\",\n",
    "        }\n",
    "    ],\n",
    "    mode=\"overwrite\",\n",
    ")\n",
    "# docsearch = LanceDB.from_texts(doc_list, embedding, connection=table)\n",
    "# retriever_lancedb = docsearch.as_retriever(search_kwargs={\"k\": 2})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ELjUfEyXK08R"
   },
   "source": [
    "## Instantiate the retriever"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "id": "OqqpYi8IK08R"
   },
   "outputs": [],
   "source": [
    "# Initialize LanceDB retriever\n",
    "docsearch = LanceDB.from_documents(pages, embedding, connection=db)\n",
    "retriever_lancedb = docsearch.as_retriever(search_kwargs={\"k\": 2})\n",
    "\n",
    "# Initialize the ensemble retriever\n",
    "ensemble_retriever = EnsembleRetriever(\n",
    "    retrievers=[bm25_retriever, retriever_lancedb], weights=[0.2, 0.8]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "T3KdbcfzK08R"
   },
   "source": [
    "## Query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "pSdnMrR0ujeA",
    "outputId": "ccaaf688-ced5-4d02-f858-29017f5f4196"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Document(metadata={'page': 46, 'source': '/content/PA00TBCT.pdf'}, page_content='Food and Nutrition Handbook for Extension Workers\\n35\\nNutrition\\tfor\\tbreastfeeding\\tmothers\\nNutritional requirements during breastfeeding are higher than during \\npregnancy because the mother has to produce enough milk to sustain a \\nbaby (bigger than the foetus) for the first six months and beyond. Breast-\\nfeeding women need to eat a wide variety of foods.\\nNutrition guidelines for pregnant women as well apply here but a \\nlactating mother needs to eat much more; that is to say one extra meal \\n(five meals in total).\\nBreastfeeding mothers should also take a lot of fluids to cater for the \\nhigh amounts of water used to make breast milk. They should avoid \\nself-medication, smoking and alcohol to prevent intoxicating the baby.\\nBreastfeeding mothers should avoid stress and have enough rest.\\nKEY MESSAGES \\n• Ensure that a pregnant mother has a balanced diet, with a vari-\\nety of foods from the food groups, and has one additional meal \\nin addition to the 3 meals she receives daily. The fourth meal \\ncaters to her physiological needs.\\n• Pregnant women should take iron and folate tablets daily in \\naddition to foods rich in iron, calcium and vitamin A.\\nKEY MESSAGES \\n• Ensure that a breastfeeding mother take a balanced diet and in \\naddition to 3 meals daily receives 2 extra meals a day to main-\\ntain her health and that of her baby.\\n• A pregnant woman and breastfeeding mother should eat a \\nvariety of foods from the main food groups daily.'),\n",
       " Document(metadata={'page': 45, 'source': '/content/PA00TBCT.pdf'}, page_content='Food and Nutrition Handbook for Extension Workers\\n34\\nguidelines for selecting energy-giving foods, body-building foods \\nand protective foods. Pregnant women especially need foods rich in \\niron and vitamin A in addition to the balanced diet. Iron needs are \\nhighly increased partly due to the need to build reserves for child \\nup to six months after birth before initiating complementary food \\nintake.\\n• Pregnant women need to take foods rich in calcium, e.g., milk and \\nmukene (silver fish) partly to take care of the increased requirement \\nfor building the foetus skeletal structure.\\n• Pregnant women have higher needs for nutrients generally and \\nshould take snacks in between meals.\\nIn addition, pregnant women should be educated to strictly observe the \\nfollowing:\\n1. Take the required amounts of iron and folic acid supplements to \\nprevent anaemia.\\n2. Sleep under an insecticide-treated mosquito net.\\n3. Visit the nearest health facility at least four (4) times for antenatal \\ncare. This will enable them access a number of services that prepare \\nthem to deliver a healthy baby.\\n4. Deliver in a healthy facility with the help of a skilled health worker.\\n5. Get deworming pills, IPT and tetanus vaccine from a health facility.\\n6. Avoid excessive workloads therefore community and family support \\nmechanisms should be encouraged.\\n7. Pregnant women should limit intake of alcohol, cigarettes. These \\ncause negative effects on the foetus.\\n8. Should strictly take drugs on advice of the health personnel as some \\nof them are potentially harmful to the unborn child.\\n9. Avoid negative cultural practices that reduce the intake of nutritious \\nfoods or impact negatively on their health such as:\\n• Not consuming chicken and eggs.\\n• Pregnant women not defecating in toilets/pit latrines.'),\n",
       " Document(metadata={'source': '/content/PA00TBCT.pdf', 'page': 44}, page_content='Food and Nutrition Handbook for Extension Workers\\n33\\nCHAPTER FOUR\\nESSENTIAL NUTRITION ACTIONS IN \\nAGRICULTURE\\nT\\nhe Ministry of Agriculture, Animal Industry and Fisheries shares a \\nrole in executing essential nutrition actions. Those areas where the \\nministry of agriculture can contribute towards nutrition improvement \\nare:\\n• Promoting control of anaemia.\\n• Promoting production and consumption of iron-rich foods.\\n• Promoting production and consumption of vitamin A-rich foods.\\n• Promoting consumption of iodized salt.\\n• Promoting vitamin A supplementation.\\n• Ensuring adequate intake of quality food for the household mem-\\nbers.\\n• Reduction of women workload in agriculture.\\nTherefore, consistent with these actions, the Ministry is concerned with \\nnutrition for pregnant mothers, breastfeeding mothers and children \\nbelow five years.\\nNutrition\\tfor\\tpregnant\\twomen\\nIt is necessary that a woman is well nourished before pregnancy so that \\nby the time she conceives, the body has sufficient capacity to meet both \\nher and the baby’s needs. A malnourished woman may fail to deliver \\nbaby alive or if she does, the baby is likely to be underweight (the normal \\nrange is 2.5–4.5 kg at birth). One of the leading causes of maternal death \\nat childbirth is insufficient blood.\\nDuring pregnancy women have high nutrient needs because they have \\nto build foetus tissue, build reserves for breast milk and also cater for \\ntheir own nutritional needs. On average women should gain 8 –12 kg in \\nthe course of pregnancy. Pregnant women need to eat more food rather \\nthan decrease the intake.\\n• Pregnant women need to consume balanced diet following the'),\n",
       " Document(metadata={'source': '/content/PA00TBCT.pdf', 'page': 31}, page_content='Food and Nutrition Handbook for Extension Workers\\n20\\nPrevalence\\tof\\tmalnutrition\\tin\\tUganda\\nMalnutrition is one of the main public health and economic and devel -\\nopment problems facing Uganda. Children below the age of five years \\nand women in reproductive age including pregnant women and lactating \\nmothers are mostly affected (UDHS 2011). Children below the age of 5 \\nyears suffer mostly from under nutrition with:\\n• 33% of these children suffer from chronic undernutrition (they are \\nstunted)\\n• 14% are underweight (body weight too light for their age)\\n• 49% suffer from iron deficiency anaemia (lack of iron/blood)\\n• 60% suffer from different forms of iodine deficiency disorders (IDD) \\nLikewise women in reproductive age (15–49 years) also suffer from \\nmalnutrition:\\n• 52% of pregnant women and lactating mothers have vitamin A defi-\\nciency\\n• 23% suffered from iron deficiency anaemia\\n \\nFigure\\t1.\\tSummary\\tof\\ttypes\\tand\\tcategories\\tof\\tmalnutrition')]"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Example customer query\n",
    "query = \"what nutrition needed for pregnant women ?\"\n",
    "\n",
    "\n",
    "# Retrieve relevant documents/products\n",
    "docs = ensemble_retriever.get_relevant_documents(query)\n",
    "\n",
    "# Extract and print only the page content from each document\n",
    "# for doc in docs:\n",
    "#     print(doc.page_content)\n",
    "\n",
    "docs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1DIpWoPOvykU"
   },
   "source": [
    "## Ask questions on this retriever doc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 90
    },
    "id": "dxISRtkPuw1e",
    "outputId": "2163f5f1-cb28-4845-e621-b865556643a8"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "<ipython-input-26-ad04468c4fb1>:9: LangChainDeprecationWarning: The method `Chain.run` was deprecated in langchain 0.1.0 and will be removed in 1.0. Use :meth:`~invoke` instead.\n",
      "  qa.run(query)\n"
     ]
    },
    {
     "data": {
      "application/vnd.google.colaboratory.intrinsic+json": {
       "type": "string"
      },
      "text/plain": [
       "'Pregnant women need to consume a balanced diet with a variety of foods from the main food groups daily. They should include foods rich in iron, calcium, and vitamin A. Additionally, pregnant women should take iron and folate tablets daily, get adequate rest, avoid stress, and have regular antenatal care visits.'"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "llm = ChatOpenAI(openai_api_key=os.environ[\"OPENAI_API_KEY\"])\n",
    "\n",
    "qa = RetrievalQA.from_chain_type(\n",
    "    llm=llm, chain_type=\"stuff\", retriever=ensemble_retriever\n",
    ")\n",
    "\n",
    "\n",
    "query = \"what nutrition is needed for pregnant women  \"\n",
    "qa.run(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 90
    },
    "id": "dkGU_jKf2k6G",
    "outputId": "6834203c-ad36-411c-dde9-fb9f1e5d4c2e"
   },
   "outputs": [
    {
     "data": {
      "application/vnd.google.colaboratory.intrinsic+json": {
       "type": "string"
      },
      "text/plain": [
       "'Foods that are needed for building strong bones and teeth include sources of calcium, magnesium, vitamin D, and fluoride. Calcium and vitamin D are essential for bone health, while magnesium plays a role in bone structure. Fluoride is important for tooth formation and preventing tooth decay. Sources of these nutrients include:\\n\\n- Calcium: milk and dairy products, fish eaten with bones, dark green vegetables.\\n- Magnesium: legumes, whole-grain cereals, nuts, and dark-green vegetables.\\n- Vitamin D: sun exposure, Vitamin D-fortified milk, eggs, fatty fish.\\n- Fluoride: seafood, tea, coffee, soybeans, iodized salt.\\n\\nThese nutrients play crucial roles in building and maintaining strong bones and teeth.'"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query = \"which food needed for building strong bones and teeth ? which Vitamin & minerals importat for this? \"\n",
    "qa.run(query)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XNZdNIUM2etT"
   },
   "source": [
    "## Bonus\n",
    "### FTS is another important feature for extracting all info .. if any one word is matching"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vhnitNN7GqY-"
   },
   "source": [
    "**Usecase** : E-Commerce Product Search\n",
    "\n",
    "**Context**: Customers searching for products on an e-commerce website.\n",
    "\n",
    "Application: When a customer types a query (like \"fitness t-shirt\"), the system uses the ensemble retriever to find the most relevant products from the product descriptions. The BM25 component helps capture keyword-based matches, while the dense vector retriever (LanceDB) understands the semantic context of the query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "kQS-HZJsuzUX",
    "outputId": "12615c3f-fbeb-42e5-fb61-a809fcc940bd"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Collecting tantivy==0.20.1\n",
      "  Downloading tantivy-0.20.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.3 kB)\n",
      "Downloading tantivy-0.20.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB)\n",
      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.1/4.1 MB\u001b[0m \u001b[31m32.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
      "\u001b[?25hInstalling collected packages: tantivy\n",
      "Successfully installed tantivy-0.20.1\n"
     ]
    }
   ],
   "source": [
    "!pip install tantivy==0.20.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "yh1IdakP4yOa"
   },
   "source": [
    "read more about fts https://lancedb.github.io/lancedb/fts/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "EEvixPQiarAo",
    "outputId": "5d1f883e-1508-44ff-c060-870992770f68"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['Frodo was a happy puppy']\n"
     ]
    }
   ],
   "source": [
    "# example of FTS. when you want to serch whole text\n",
    "import lancedb\n",
    "\n",
    "uri = \"data/sample-lancedb\"\n",
    "db = lancedb.connect(uri)\n",
    "\n",
    "table = db.create_table(\n",
    "    \"my_tableasd\",\n",
    "    data=[\n",
    "        {\"vector\": [3.1, 4.1], \"text\": \"Frodo was a happy puppy\"},\n",
    "        {\"vector\": [5.9, 26.5], \"text\": \"There are several kittens playing\"},\n",
    "    ],\n",
    ")\n",
    "\n",
    "\n",
    "table.create_fts_index(\"text\")\n",
    "\n",
    "\n",
    "x = table.search(\"puppy\").limit(10).select([\"text\"]).to_list()\n",
    "\n",
    "\n",
    "texts = [item[\"text\"] for item in x]\n",
    "print(texts)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
